A New Algorithm for Finding Closest Pair of Vectors

Authors

  • Ning Xie
  • Shuai Xu
  • Yekun Xu

Abstract

Given n vectors x_0, x_1, ..., x_{n−1} in {0, 1}^m, how can we find two vectors whose pairwise Hamming distance is minimum? This problem is known as the Closest Pair Problem. If these vectors are generated uniformly at random, except that two of them are correlated with Pearson correlation coefficient ρ, then the problem is called the Light Bulb Problem. In this work, we propose a novel coding-based scheme for the Closest Pair Problem. We design both randomized and deterministic algorithms, which achieve the best-known running time when the minimum distance is very small compared to the length of the input vectors. When applied to the Light Bulb Problem, our algorithms yield state-of-the-art deterministic running time when the Pearson correlation coefficient ρ is very large.

∗ An extended abstract of this article is to appear in Proceedings of the 13th International Computer Science Symposium in Russia (CSR’18).
†‡§ Florida International University, Miami, FL 33199, USA. Email: [email protected]. Research supported in part by NSF grant 1423034.
arXiv:1802.09104v2 [cs.DS] 27 Feb 2018
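
To make the problem statement concrete, the following is a minimal sketch of the naive quadratic baseline for the Closest Pair Problem under Hamming distance. It is not the coding-based scheme proposed in the paper, which the abstract does not describe in detail. The function names `hamming` and `closest_pair` and the choice to pack each vector into a Python integer are assumptions made for illustration only.

```python
from itertools import combinations


def hamming(u: int, v: int) -> int:
    """Hamming distance between two bit vectors packed into Python ints."""
    # XOR leaves a 1 exactly at the positions where the two vectors differ.
    return bin(u ^ v).count("1")


def closest_pair(vectors: list[int]) -> tuple[int, int, int]:
    """Return (distance, i, j) with i < j for a pair at minimum Hamming distance.

    Naive baseline that compares all n*(n-1)/2 pairs; it is NOT the
    coding-based algorithm of the paper (hypothetical helper for illustration).
    """
    if len(vectors) < 2:
        raise ValueError("need at least two vectors")
    # Tuples compare lexicographically, so min() picks the smallest distance
    # and breaks ties by the smallest index pair.
    return min(
        (hamming(vectors[i], vectors[j]), i, j)
        for i, j in combinations(range(len(vectors)), 2)
    )


# Four vectors of length m = 6, written as binary literals.
vectors = [0b101100, 0b010011, 0b101110, 0b000111]
result = closest_pair(vectors)
print(result)  # (1, 0, 2): vectors 0 and 2 differ in a single bit
```

This baseline uses on the order of n^2 distance computations, which is the cost that faster algorithms for the problem aim to improve on.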


Related Resources

Closest Pairs Data Selection for Support Vector Machines

This paper presents data selection procedures for support vector machines (SVM). The purpose of data selection is to reduce the dataset by eliminating as many non support vectors (non-SVs) as possible. Based on the fact that support vectors (SVs) are those vectors close to the decision boundary, data selection keeps only the closest pair vectors of opposite classes. The selected dataset will re...

On the Difference Between Closest, Furthest, and Orthogonal Pairs: Nearly-Linear vs Barely-Subquadratic Complexity in Computational Geometry

Point location problems for n points in d-dimensional Euclidean space (and ℓ_p spaces more generally) have typically had two kinds of running-time solutions: (Nearly-Linear) less than d · n log n time, or (Barely-Subquadratic) f(d) · n^(2−1/Θ(d)) time, for various functions f. For small d and large n, “nearly-linear” running times are generally feasible, while the “barely-subquadratic” times are gen...

On the Difference Between Closest, Furthest, and Orthogonal Pairs: Nearly-Linear vs Barely-Subquadratic Complexity

Point location problems for n points in d-dimensional Euclidean space (and ℓ_p spaces more generally) have typically had two kinds of running-time solutions: (Nearly-Linear) less than d · n log n time, or (Barely-Subquadratic) f(d) · n^(2−1/Θ(d)) time, for various f. For small d and large n, “nearly-linear” running times are generally feasible, while the “barely-subquadratic” times are generally in...

Optimum Partition Parameter of Divide-and-Conquer Algorithm for Solving Closest-Pair Problem

Divide and conquer is a well-known algorithmic procedure for solving many kinds of problems. In this procedure, the problem is repeatedly partitioned into two parts until it is trivially solvable. Finding the distance of the closest pair is an interesting topic in computer science, and the closest-pair problem can be solved with a divide-and-conquer algorithm. Here also the problem is partitioned into two pa...

On Spatial-Range Closest-Pair Query

An important query for spatial database research is to find the closest pair of objects in a given space. Existing work assumes that the two objects of the closest pair come from two different data sets indexed by R-trees. The closest pair in the whole space is then found via an optimized R-tree join technique. However, this technique doesn’t perform well when the two data sets are identical. And it does...

Journal title:
  • CoRR

Volume abs/1802.09104  Issue 

Pages  -

Publication date 2018